A trial deep learning-based model for four-class histologic classification of colonic tumor from narrow band imaging

Narrow band imaging (NBI) has been extensively utilized as a diagnostic tool for colorectal neoplastic lesions. This study aimed to develop a trial deep learning (DL) based four-class classification model for low-grade dysplasia (LGD), high-grade dysplasia or mucosal carcinoma (HGD), superficially invasive submucosal carcinoma (SMs), and deeply invasive submucosal carcinoma (SMd), and to evaluate its potential as a diagnostic tool. We collected a total of 1,390 NBI images from 210 lesions as the dataset, comprising 53 LGD, 120 HGD, 20 SMs and 17 SMd lesions. A total of 598,801 patches were cropped from the lesion and background. A patch-based classification model was built by employing a residual convolutional neural network (CNN) and validated by three-fold cross-validation. The patch-based validation accuracy was 0.876, 0.957, 0.907 and 0.929 in LGD, HGD, SMs and SMd, respectively. The image-level classification algorithm was derived from the patch-based mapping across the entire image domain, attaining accuracies of 0.983, 0.990, 0.964, and 0.992 in LGD, HGD, SMs, and SMd, respectively. Our CNN-based model demonstrated high performance in categorizing the histological grade of dysplasia as well as the depth of invasion in routine colonoscopy, suggesting a potential diagnostic tool requiring minimal human input.

www.nature.com/scientificreports/ cancer 12 . However, classification criteria described in empirical terms 12 may inevitably suffer from a variety of evaluation biases, so that accuracy varies among endoscopists and is difficult to compare across different endoscopist communities. This study aimed to develop a trial CNN-based supervised learning model for evaluating histologic atypism or invasion depth from NBI images of detected colonic neoplastic lesions and to evaluate its potential as a diagnostic tool.

Methods
Preparation of endoscopic images. NBI images of neoplastic lesions from patients who underwent endoscopic or surgical resection at Sendai City Medical Center Sendai Open Hospital from April 2017 to December 2019 were used for this single-center retrospective study. Characteristics of the collected NBI images are summarized in Table 1. A total of 1,390 NBI images were sampled from a total of 210 lesions with definite histologic diagnosis 13 : 53 low-grade dysplasia (LGD); 120 high-grade dysplasia or mucosal carcinoma (HGD); 20 superficially invasive (depth of the invasive front < 1000 µm) submucosal carcinoma (SMs); and 17 deeply invasive (depth of the invasive front > 1000 µm) submucosal carcinoma (SMd). Pathological diagnosis was conducted in a blinded manner by pathologists unaware of the study design. For a mucosal lesion, the diagnosis of LGD or HGD was assigned according to the most severe grade present, regardless of the size of that component. The number of images sampled per lesion was 5.5 to 7, with the following average image-capturing conditions: no magnification, 41.0%; low magnification, 37.9%; high magnification, 21.1%. Images of each solitary lesion at varying magnifications were carefully chosen to minimize potential bias in the selection process. The video endoscopes CF-HQ290ZI, PCF-H290ZI, PCF-H290TI and video endoscopy system EVIS LUCERA ELITE CV-290/CLV-290SL (Olympus Medical Systems, Co., Ltd., Tokyo, Japan) were used.
Preparation of dataset. NBI images (Fig. 1a) were manually partitioned into the lesion (Fig. 1b) and background (Fig. 1c), from which patch images (128 × 128 pixels) were cropped starting from the upper left corner (white dotted patch), moving rightwards (white solid patch) and then downwards (red solid patch) at 32-pixel strides (white and red arrows) over the entire effective region of interest. Patches in which blackouts occupied more than 10% of the effective region were automatically excluded from analysis. Blackouts were defined as regions with a red component intensity lower than 50. Similarly, patches in which halations exceeded 5% of the effective region were also excluded. Halations were defined as regions with a green component intensity higher than 250. In this study, the patches were further classified into in-focus and out-of-focus patches according to the amount of high spatial frequency area extracted by a high-pass filter with a cut-off of 6.25% of the Nyquist frequency. The in-focus patches were classified into (0) background (BG), (1) LGD, (2) HGD, (3) SMs and (4) SMd, and the out-of-focus ones into (5) background (BG-oof) and (6) lesion (L-oof). A total of 598,801 patches were classified into 7 categories (Table 2). Endoscopists applied no inclusion or exclusion criteria regarding the pictorial quality of the patches. As stated, patches with excessive blackout or halation were automatically excluded before entry. The study aimed to establish an effective histologic classifier that can be used under any common shooting conditions of NBI.
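The cropping and exclusion rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `patch_origins` and `keep_patch` are hypothetical, the image is represented as a plain nested list of (R, G, B) tuples, and the "effective region" is assumed here to be the whole patch. Only the patch size, the stride and the four thresholds are taken from the text.

```python
# Values taken from the text
PATCH = 128            # patch side in pixels
STRIDE = 32            # stride in pixels
BLACKOUT_RED = 50      # red intensity below this counts as blackout
HALATION_GREEN = 250   # green intensity above this counts as halation
MAX_BLACKOUT = 0.10    # patches with a larger blackout fraction are excluded
MAX_HALATION = 0.05    # patches with a larger halation fraction are excluded


def patch_origins(height, width, patch=PATCH, stride=STRIDE):
    """Top-left corners in raster order: rightwards first, then downwards."""
    return [(top, left)
            for top in range(0, height - patch + 1, stride)
            for left in range(0, width - patch + 1, stride)]


def keep_patch(image, top, left, patch=PATCH):
    """Return True if the patch passes both exclusion rules.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    Assumption: the effective region is the entire patch.
    """
    total = patch * patch
    blackout = halation = 0
    for row in image[top:top + patch]:
        for r, g, _b in row[left:left + patch]:
            blackout += r < BLACKOUT_RED
            halation += g > HALATION_GREEN
    return (blackout / total <= MAX_BLACKOUT
            and halation / total <= MAX_HALATION)
```

For example, `[o for o in patch_origins(h, w) if keep_patch(image, *o)]` would list the origins of the patches retained from an `h` × `w` image. Restricting the origins to the manually annotated lesion or background region, and the in-focus/out-of-focus split, are left out of this sketch.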
Evaluation method. We employed cross-validation, which yields more accurate and less biased results in machine learning studies 14 . In this study, the dataset was randomly partitioned into three equal-sized folds, one of which was used for validation and the other two for training. The proportion of labels was equal in each fold. The training and validation processes were repeated three times, using a different fold for validation each time. The three validation results were then averaged to produce a single estimate.
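A three-fold split with equal label proportions per fold, as described above, can be sketched as follows. This is an illustrative sketch only: the helper names `stratified_folds` and `cross_validate` and the `evaluate` callback are hypothetical, and the text does not state whether the split was made at the patch, image or lesion level, so the sketch simply works on a flat list of labels.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=3, seed=0):
    """Assign sample indices to k folds with similar label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each label round-robin over the folds
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validate(labels, evaluate, k=3, seed=0):
    """Average `evaluate(train_indices, val_indices)` over the k folds."""
    folds = stratified_folds(labels, k, seed)
    scores = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(evaluate(train, val))
    return sum(scores) / k
```

Here `evaluate` stands for any routine that trains on the training indices and returns a validation score; each sample is used for validation exactly once.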
Architecture of the CNN. ResNet50 (a CNN) proposed by He et al. 15 and PyTorch were utilized. ResNet50 without pretraining was imported from the PyTorch library (torchvision.models). The original patches of 128 × 128 pixels were converted into images of 224 × 224 pixels. The hyperparameters were set manually as follows: optimizer, Adam; loss function, cross entropy loss; number of training epochs, 50; batch size, 256; learning rate, 0.00005 (chosen by trial and error); and number of output classes, 7.
Image-level classification. An example of SMd and the annotation mask without blackout or halation (denoted by X) are depicted in Fig. 2a,i, respectively. The patches classified into BG, LGD, HGD, SMs, SMd, BG-oof and L-oof by the trained CNN are illustrated by white (Fig. 2b), green (Fig. 2c), yellow (Fig. 2d), magenta (
Examples of the patch-based mapping and image-level classification. Figure 3 illustrates examples of input images, patch-level prediction maps and bar graphs of IoU. In cases 1, 2, 3 and 4, the ground truth histology was consistent with the predicted histology, that is, the class with the maximum intersection over union. In cases 5 and 6, HGD and SMd were misclassified as SMd and HGD, respectively. In these cases, misclassification of the surrounding background as the true lesion resulted in a lower intersection over union for the true lesion than for the misclassified one. This type of misclassification, stemming from underestimation of the true lesion relative to the misclassified ones, occurred across four SMs and has likely lowered the accuracy for SMs relative to the other lesions.
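The image-level decision rule, which this paper describes as the argmax among intersections over union (IoU) between the annotation mask and the patch-level union masks, can be sketched as below. The sketch is hedged in several ways: the function names are hypothetical, masks are represented as sets of pixel coordinates for clarity rather than speed, and restricting the argmax to the four in-focus lesion classes is an assumption.

```python
# Assumption: only the four lesion classes compete in the argmax
LESION_CLASSES = ("LGD", "HGD", "SMs", "SMd")


def patch_pixels(top, left, size):
    """Set of (row, col) coordinates covered by one square patch."""
    return {(r, c)
            for r in range(top, top + size)
            for c in range(left, left + size)}


def union_masks(patch_predictions, size):
    """Union of patch areas per lesion class.

    `patch_predictions` is an iterable of ((top, left), label) pairs.
    """
    masks = {label: set() for label in LESION_CLASSES}
    for (top, left), label in patch_predictions:
        if label in masks:
            masks[label] |= patch_pixels(top, left, size)
    return masks


def iou(a, b):
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_image(annotation_mask, patch_predictions, size=128):
    """Return the class with the highest IoU and the per-class IoU scores."""
    masks = union_masks(patch_predictions, size)
    scores = {label: iou(annotation_mask, mask)
              for label, mask in masks.items()}
    return max(scores, key=scores.get), scores
```

Under this rule, background patches wrongly predicted as a lesion class enlarge that class's union mask beyond the annotation mask and so lower its IoU, which is consistent with the failure mode described for cases 5 and 6.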

Discussion
In this study, we developed a trial CNN-based multi-class histology classifier for detected colorectal neoplastic lesions in routine colonoscopy still images taken in NBI mode under common shooting conditions. NBI offers a significant advantage for CNN-based image classification thanks to its ability to provide high-contrast or detailed pictorial information without requiring any pre-acquisition preparation. The diagnostic process comprises patch-level histology mapping over the entire in-focus region of the NBI image by a trained ResNet50, followed by calculation of the argmax among the intersections over union between the annotation mask and the patch-level union masks to obtain the image-level histology. This model achieved an image-level accuracy of 0.986, suggesting its potential as a diagnostic tool. The advancement of machine learning using CNNs has enabled physicians to apply CAD of medical images in their specialized fields. The American Society for Gastrointestinal Endoscopy AI Task Force 16 stated that CAD plays a crucial role in screening and surveillance colonoscopy for colorectal cancer prevention. Similarly, the European Society of Gastrointestinal Endoscopy noted the capability of AI to accurately predict the histology of polyps from endoscopic images and to improve the cost-efficiency and safety of colonoscopic colorectal cancer screening and surveillance 17 .
Table 3. Patch-level three-fold validation accuracy. BG, background; HGD, high grade dysplasia; LGD, low grade dysplasia; SMs, superficially invasive submucosal carcinoma; SMd, deeply invasive submucosal carcinoma; BG-oof, out-of-focus background; L-oof, out-of-focus lesion.
The supervised learning of a CNN has enabled the development of a model for automated detection of colon polyps 6 , as well as binary classification models for distinguishing adenomatous from non-adenomatous polyps (with a ten-fold validation accuracy of 0.751) 7 , adenomatous from hyperplastic diminutive colorectal polyps (with an accuracy of 94%) 8 , and neoplastic polyps from non-neoplastic polyps (with a high confidence rate of 0.85) 9 .

However, a CNN model for multiclass differentiation among low-grade dysplasia, high-grade dysplasia, and carcinoma with superficial or deep submucosal invasion has not been developed. Although invasion depth determines the therapeutic intervention (endoscopic resection or surgery), it has so far been evaluated by endoscopists using knowledge-based criteria 12,18 , which inevitably suffer from a variety of evaluation biases. One study has reported a CNN-based binary prediction model for deeply invasive submucosal carcinoma with an overall accuracy of 85.5%, which is comparable to that of expert endoscopists 19 . In this study, the prediction accuracy for deeply invasive submucosal carcinoma was 0.991, although model accuracy must be compared with caution between studies with different designs. To compare the accuracy of ML models accurately regardless of algorithm and number of classes, the promotion of a benchmark dataset library with annotation masks 20 is essential.
A patch-based CNN has been utilized for automated detection of a target area within a whole slide image in digital pathology 21,22 . This method has recently been applied to automated severity mapping along the entire colorectum in patients with ulcerative colitis from capsule endoscopy video files 23 , as it is considered advantageous when the object to be classified is composed of topographically varying elements, such as severity or atypism. When reconstructing histologic maps of resected specimens, one often encounters topographic heterogeneity in the grade of dysplasia as well as in invasion depth. In this study, a single histology or label was assigned to each lesion image, resulting in a uniform label across the entire lesion area, which may have affected the outcomes. Nonetheless, a multi-class classification model developed from a trained patch-level classifier achieved a high image-level accuracy of over 0.96, which may provide a potential diagnostic tool requiring minimal human input in routine colonoscopy.
The limitations of this study were its single-center, retrospective nature, limited dataset size, and lack of external validation. Moreover, other classification models, including VGGNets, DenseNets and ViT, were not explored, and the applicability of the model to diagnosis with subsequent endoscopy systems is uncertain.

Data availability
The data generated or analyzed during this study are included in this published article. Some datasets generated and/or analyzed during the current study are not publicly available due to privacy but are available from the corresponding author on reasonable request.
LGD, low grade dysplasia; SMs, superficially invasive submucosal carcinoma; SMd, deeply invasive submucosal carcinoma; IoU, intersection over union.

Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.